Inference of Extreme Synchrony with an Entropy Measure on a Bipartite Network
This article proposes a method to quantify the structure of a bipartite graph with a network
entropy from a statistical-physical point of view. The network entropy of a bipartite graph with
random links is computed by numerical simulation. As an application of the proposed method to
the analysis of collective behavior, the states in which participants quote and trade in the foreign
exchange market are quantified. The network entropy per node is found to correspond to the
macroeconomic situation. A finite mixture of Gumbel distributions is fitted to the empirical
distribution of the weekly minimum values of network entropy per node. The mixture of Gumbel
distributions, with parameters estimated by a segmentation procedure, is verified by the
Kolmogorov-Smirnov test. The finite mixture of Gumbel distributions can extrapolate the
probability of extreme events that have never been observed.



The network structure of various kinds of physical and social systems has
attracted considerable research attention. A many-body system can be described
as a network, and the nature of growing networks has been examined
extensively [1]. Power-law properties can be found in growing networks, which
are called complex networks. These properties are related to the growth of
elements and preferential attachment.

A network consists of several nodes and links that connect nodes. In the
literature on the physics of socio-economic systems, nodes are assumed to
represent agents, goods, and computers, and links express the relationships
between nodes [3]. Network structure is perceived in many cases through the
conveyance of information, knowledge, and energy, among others.

Researchers have used a methodology to characterize the network structure with
information-theoretic entropy. According to the study of Dehmer and
Mowshowitz, the concept of graph entropy was first proposed in the 1950s to
measure structural complexity. Rashevsky [8], Trucco, and Mowshowitz were the
first researchers to define and investigate the entropy of graphs. Several
graph invariants such as the number of vertices, the vertex degree sequence,
and extended degree sequences have been used in the construction of
entropy-based measures.

Wilhelm and Hollunder proposed a method to characterize directed weighted
networks with several nodes. They considered the normalized weight of the flux
between two nodes as the probability of a symbol in the transmitter signal
that corresponds to the sum of all influxes to/effluxes from a given node.
Sato also considered information-theoretic measures for a bipartite graph and
inferred economic situations using the network entropy of relative frequencies
among group populations.

In statistical physics, the number of possible configurations under given
energy constraints is related to "entropy." Entropy is a measure that
quantifies the states of thermodynamic systems. In physical systems, entropy
naturally increases because of the thermal fluctuations of elements. Boltzmann
proposed that the entropy $S$ is computed from the possible number of
ensembles $g$ by $S = \log g$. For a system that consists of two sub-systems
whose respective entropies are $S_1$ and $S_2$, the total entropy $S$ is
calculated as the sum of those of the two sub-systems, $S = S_1 + S_2$,
because the possible number of ensembles of the whole system is $g_1 g_2$. The
entropy in statistical physics is also related to the degree of complexity of
a physical system. If the entropy is low (high), then the physical
configuration is rarely (often) realized. Energy injection or work in an
observed system may be assumed to realize rare situations.

The concept of statistical-physical entropy was applied by Bianconi to measure
network structure. Bianconi considered that the complexity of a network is
related to the number of possible configurations of nodes and links under some
constraints determined by observations, and calculated the network entropy of
an arbitrary network for several cases of constraints.

The number of elements in socio-economic systems is usually very large, and
observations are often restricted or finite. Therefore, we need to develop a
method to infer or quantify the state of the entire network structure from
partial observations. Specifically, many affiliation relationships of
socio-economic systems can be expressed as a bipartite network. For example, a
symmetric binary 2-mode network can be constructed by linking $M$ participants
and $K$ groups if these participants belong to the groups. Assume that we can
count the number of participants in each group within the time window
$[t\delta, (t+1)\delta]$ ($t = 1, 2, 3, \ldots$), which is defined as $m_i(t)$
($i = 1, 2, \ldots, K$). How do we measure the complexity of the bipartite
graph from $m_i(t)$ at each observation time $t$?

Such a structure can be expressed as a bipartite graph. Describing the network
structure of complex systems that consist of two types of nodes by using a
bipartite network is important. A bipartite graph model can also be used as a
general model for complex networks [13]-[15]. Tumminello et al. proposed a
statistical method to validate the heterogeneity of bipartite networks.

Let us assume a bipartite graph consisting of A nodes and B nodes, the
structure of which at time $t$ is described by an adjacency matrix
$C_{ij}(t)$. We assume that A nodes are observable and B nodes are
unobservable. That is, we only know the number of links at A nodes, $m_i(t)$
($i = 1, \ldots, K$). We do not know the correct number of B nodes, but we
assume that it is $M$. Then the number of possible configurations under the
constraint $m_i(t) = \sum_{j=1}^{M} C_{ij}(t)$ may be counted as

$$N(t) = \prod_{i=1}^{K} \binom{M}{m_i(t)} = \prod_{i=1}^{K} \frac{M!}{m_i(t)!\,(M - m_i(t))!}. \tag{1}$$

Following the concept of Bianconi [14], the network entropy at time $t$ is
defined as $\Sigma(t) = \ln N(t)$. Inserting Eq. (1) into this definition, we
have

$$\Sigma(t) = K \sum_{n=1}^{M} \ln n - \sum_{i=1}^{K} \sum_{n=1}^{M - m_i(t)} \ln n - \sum_{i=1}^{K} \sum_{n=1}^{m_i(t)} \ln n. \tag{2}$$

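Note that because $0! = 1$, $\sum_{n=1}^{0} \ln n = 0$. Obviously, if
$m_i(t) = M$ for all $i$, then $\Sigma(t) = 0$. If $m_i(t) = 0$ for all $i$,
then $\Sigma(t) = 0$. A smaller number of combinations gives a lower value of
$\Sigma(t)$. Hence, the entropy per link is defined as the network entropy
divided by the total number of links,

$$\sigma(t) = \frac{\Sigma(t)}{\sum_{i=1}^{K} m_i(t)}. \tag{3}$$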



This quantity shows the degree of complexity of the bipartite network
structure. We may capture the temporal development of the network structure
from the value of $\sigma(t)$.

FIG. 1. (a) Plots of $\sigma(t)$ versus the degree of monopolization $k$.
(b) Plots of $\sigma(t)$ versus the density of links $p$.
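
The entropy computation can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the code used for the results: it evaluates the network entropy as the sum of log binomial coefficients that Eq. (2) expands, and it takes the entropy per link to be the network entropy divided by the total number of links, which is our reading of Eq. (3). The function names are hypothetical, and the value 0.0 returned for a network without links is a convention chosen here.

```python
import math


def network_entropy(m, M):
    """Network entropy ln N(t) for link counts m = [m_1, ..., m_K] and M B nodes.

    Each A node contributes ln C(M, m_i); log-gamma avoids huge factorials.
    """
    total = 0.0
    for mi in m:
        if not 0 <= mi <= M:
            raise ValueError("each m_i must satisfy 0 <= m_i <= M")
        total += math.lgamma(M + 1) - math.lgamma(mi + 1) - math.lgamma(M - mi + 1)
    return total


def entropy_per_link(m, M):
    """Network entropy divided by the total number of links (assumed normalization)."""
    links = sum(m)
    if links == 0:
        return 0.0  # convention assumed here for an empty network
    return network_entropy(m, M) / links


if __name__ == "__main__":
    # 100 A nodes with one link each and M = 1000: the result equals ln(1000).
    print(entropy_per_link([1] * 100, 1000))
```

With one link per A node, every factor in the product is C(M, 1) = M, so the entropy per link reduces to ln M.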

To understand the fundamental properties of Eq. (3), we compute $\sigma(t)$ in
simple cases. Consider values of entropy for several cases at $K = 100$ with
different $M$. We assume that the total number of links is fixed at 100, which
is the same as the number of A nodes, and we confirm the dependence of
$\sigma(t)$ on the degree of monopolization. We assign the same number of
links to each of $k$ A nodes. That is, we set $m_i(t) = 100/k$
($k = 1, 2, 4, 5, 10, 20, 50, 100$) for $i = 1, \ldots, k$ and $m_i(t) = 0$
for $i = k+1, \ldots, K$. Fig. 1(a) shows the relationship between
$\sigma(t)$ and the degree of monopolization at $M$ = 1,000, 2,000, 3,000, and
4,000. $\sigma(t)$ is small if a small population of nodes occupies a large
number of links. A regime in which the links are spread over many nodes gives
a large value of $\sigma(t)$. The value of $\sigma(t)$ is a monotonically
increasing function of $k$. As $M$ increases, the value of $\sigma(t)$
increases. From this instance, we confirmed that $\sigma(t)$ decreases as the
links become more concentrated on a few A nodes, that is, as the degree of
monopolization rises.
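
The monopolization experiment can be reproduced with a short script. This is a minimal sketch, assuming that the entropy per link is the network entropy divided by the total number of links; the names `K_VALUES` and `monopolization_curve` are hypothetical and chosen for illustration.

```python
import math

K = 100                 # number of A nodes
TOTAL_LINKS = 100       # total number of links, kept fixed
K_VALUES = [1, 2, 4, 5, 10, 20, 50, 100]


def entropy_per_link(m, M):
    """Sum of ln C(M, m_i) divided by the total number of links (assumed normalization)."""
    log_n = sum(
        math.lgamma(M + 1) - math.lgamma(mi + 1) - math.lgamma(M - mi + 1)
        for mi in m
    )
    return log_n / sum(m)


def monopolization_curve(M):
    """Return [(k, entropy per link)] when k A nodes share the links equally."""
    curve = []
    for k in K_VALUES:
        m = [TOTAL_LINKS // k] * k + [0] * (K - k)
        curve.append((k, entropy_per_link(m, M)))
    return curve


if __name__ == "__main__":
    for M in (1000, 2000, 3000, 4000):
        print(M, [(k, round(s, 3)) for k, s in monopolization_curve(M)])
```

The curve rises with k because ln C(M, m)/m is an average of the decreasing terms ln((M - j + 1)/j), j = 1, ..., m, so smaller per-node link counts give a larger entropy per link.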

Next, we confirm the dependence of $\sigma(t)$ on the density of links. We
assume that each element of the adjacency matrix $C_{ij}(t)$ is given by an
i.i.d. Bernoulli random variable with success probability $p$. Then
$m_i(t) = \sum_{j=1}^{M} C_{ij}(t)$ is sampled from an i.i.d. binomial
distribution $\mathrm{Bin}(p, M)$. Fig. 1(b) shows the plots of $\sigma(t)$
versus $p$. The number of links at each A node monotonically increases as $p$
increases. $\sigma(t)$ decreases as the density of links increases. The
dependence of the entropy per node on $p$ is independent of $M$.
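
The random-link experiment can be sketched as follows. This is an illustrative simulation under the assumption that the entropy is normalized by the total number of links; `sigma_random_links` is a hypothetical name, and the independence of $M$ holds only approximately for finite $M$ in this sketch.

```python
import math
import random


def entropy_per_link(m, M):
    """Sum of ln C(M, m_i) divided by the total number of links (assumed normalization)."""
    links = sum(m)
    if links == 0:
        return 0.0  # convention assumed here for an empty network
    log_n = sum(
        math.lgamma(M + 1) - math.lgamma(mi + 1) - math.lgamma(M - mi + 1)
        for mi in m
    )
    return log_n / links


def sigma_random_links(p, M, K=100, rng=random):
    """Draw each adjacency element as Bernoulli(p) and return the entropy per link."""
    m = [sum(rng.random() < p for _ in range(M)) for _ in range(K)]
    return entropy_per_link(m, M)


if __name__ == "__main__":
    rng = random.Random(42)
    for M in (500, 2000):
        row = [round(sigma_random_links(p, M, rng=rng), 3) for p in (0.1, 0.3, 0.5, 0.7, 0.9)]
        print(M, row)
```

For large $M$ the value approaches $[-p \ln p - (1-p)\ln(1-p)]/p$, which depends on $p$ only and decreases as $p$ grows.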

The application of network analysis to financial time series has been
advancing. Several researchers have investigated the network structure of
financial markets [19]. Bonanno et al. examined the topological
characterization of the correlation-based minimum spanning tree (MST) of real
data. Gworek et al. analyzed the exchange rate returns of 38 currencies
(including gold), and computed the characteristic path length and average
weighted clustering coefficient of the MST topology of the graph extracted
from the cross-correlations for several base currencies. Podobnik et al.
examined the cross-correlations between volume changes and price changes for
the New York Stock Exchange, the Standard and Poor's 500 index, and 28
worldwide financial indices. Iori et al. analyzed the network topology of the
Italian segment of the European overnight money market and investigated the
evolution of these banks' connectivity structure over the maintenance period.
These studies collectively aimed to detect the susceptibility of network
structure to macroeconomic situations.

The proposed method based on statistical-physical entropy is applied to
measure the states of the foreign exchange market. The relationship between
the bipartite network structure and macroeconomic shocks or crises was
investigated, and the occurrence probabilities of extreme synchrony were
inferred. Data collected from the ICAP EBS platform were used. The data period
spanned May 28, 2007 to May 31, 2012 [20]. The data included records of orders
(BID/OFFER) and transactions of currencies and precious metals with one-second
resolution. The data set involved 93 currency pairs consisting of 39 currencies, 11
precious metals, and 2 basket currencies (NZD, AUD, 
AUQ, JPY, HKD, CNH, SGD, THB, BHD, TRY, ILS, 
AED, KWD, SAR, KES, KET, ZAR, RUB, KZA, KZT, 
RON, PLN, HUE, SEK, SKK, CZK, DKK, NOK, CHE, 
EUR, EUQ, GBP, GBQ, ISK, CAD, USD, DLR, USQ, 
MXN, MXC, MXT, XAG, BAG, XAU, BAU, SAU, XPD, 
XPT, BKT, BKQ, LPD, and LPT). 

The number of quotations and transactions was extracted from the raw data.
Let $m_{X,i}(t)$ ($t = 0, \ldots$) be the number of quotations ($X = P$) or
transactions ($X = D$) within every minute ($\delta = 1$ [min]) at currency
pair $i$. We compute a statistical-physical entropy per node from
$m_{X,i}(t)$ ($X \in \{P, D\}$) with Eqs. (2) and (3), which is denoted as
$\sigma_X(t)$.
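
The counting step can be illustrated with a small helper. The record layout below, a pair of a timestamp in seconds and a currency-pair label, is a hypothetical simplification and not the actual format of the data set; the windows are taken as half-open so that a record on a boundary is counted once.

```python
from collections import Counter


def count_per_window(records, delta=60):
    """Count records per currency pair in windows of delta seconds.

    records: iterable of (timestamp_seconds, pair) tuples (hypothetical layout).
    Returns {t: Counter({pair: count})}, where t indexes [t*delta, (t+1)*delta).
    """
    counts = {}
    for timestamp, pair in records:
        t = int(timestamp // delta)
        counts.setdefault(t, Counter())[pair] += 1
    return counts


if __name__ == "__main__":
    quotes = [(0, "EUR/USD"), (59, "EUR/USD"), (60, "USD/JPY"), (61, "EUR/USD")]
    for t, per_pair in sorted(count_per_window(quotes).items()):
        print(t, dict(per_pair))
```

The counts for one window, taken over all currency pairs, play the role of the link counts at the observable nodes from which the entropy is evaluated.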

$\sigma_P(t)$ and $\sigma_D(t)$ in the European and American time zones are
normally smaller than $\sigma_P(t)$ and $\sigma_D(t)$ in the Asian time zone.
This result indicates a regional dependence of the active participants.
Specifically, $\sigma_P(t)$ and $\sigma_D(t)$ exhibit several peaks in each
central region, and rapidly increase from European to American time zones.

Let us consider the minimum value of $\sigma_X(t)$ in every week:

$$v_X(s) = \min_{t \in W(s)} \{\sigma_X(t)\}, \tag{4}$$



where $W(s)$ ($s = 1, \ldots, T$) represents the set of times included in the
$s$-th week. A total of 268 weeks are observed in the data set ($T = 268$).
According to extreme value theory, the probability density for minimum values
can be assumed to be a Gumbel density:

$$P(v_X; \mu_X, \rho_X) = \frac{1}{\rho_X} \exp\left[\frac{v_X + \mu_X}{\rho_X} - \exp\left(\frac{v_X + \mu_X}{\rho_X}\right)\right], \tag{5}$$

where $\mu_X$ and $\rho_X$ are the location and scale parameters,
respectively. Under the assumption of a Gumbel distribution, these parameters
are estimated with the maximum likelihood procedure. The parameter estimates
are $\mu_P = -4.85$ and $\rho_P = 0.101$ for quotations, and $\mu_D = -4.99$
and $\rho_D = 0.111$ for transactions. The Kolmogorov-Smirnov test is
conducted to assess the goodness of fit of the estimated distributions. The
p-value of the distribution estimated for quotations is 0.005. The p-value of
the distribution estimated for transactions is 0.02. Hence, the stationary
Gumbel assumption is rejected at the 5% significance level for both
quotations and transactions, and it cannot explain the observed
synchronizations.
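
The fitting and testing steps can be sketched with the standard library alone. The sketch assumes the Gumbel distribution for minima with cumulative distribution $1 - \exp[-\exp((v + \mu)/\rho)]$, solves the likelihood equation for the scale by bisection, and uses the asymptotic Kolmogorov series for the p-value. The function names are hypothetical, the source does not state which software was used, and a p-value computed with parameters estimated from the same sample is only approximate.

```python
import math
import random


def fit_gumbel_min(v, iters=100):
    """Maximum-likelihood fit of F(v) = 1 - exp(-exp((v + mu) / rho)); returns (mu, rho)."""
    n, vmax, vmin = len(v), max(v), min(v)
    if vmax == vmin:
        raise ValueError("need at least two distinct values")
    mean = sum(v) / n

    def weights(rho):
        # exp(v_i / rho), shifted by vmax so that nothing overflows
        return [math.exp((x - vmax) / rho) for x in v]

    def g(rho):
        # likelihood equation for the scale: weighted mean - mean - rho = 0
        w = weights(rho)
        return sum(x * wi for x, wi in zip(v, w)) / sum(w) - mean - rho

    lo, hi = 1e-6 * (vmax - vmin), vmax - vmin   # g changes sign on [lo, hi]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    location = vmax + rho * math.log(sum(weights(rho)) / n)
    return -location, rho   # the density is written with (v + mu), so mu = -location


def gumbel_min_cdf(x, mu, rho):
    z = min((x + mu) / rho, 700.0)   # clamp to keep exp() finite
    return 1.0 - math.exp(-math.exp(z))


def ks_test(v, cdf):
    """Return (D, approximate p-value) of the one-sample Kolmogorov-Smirnov test."""
    xs, n = sorted(v), len(v)
    d = max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.3:
        return d, 1.0   # the series converges poorly here; the p-value is close to 1
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    rng = random.Random(1)
    # synthetic minima with mu = -4.85 and rho = 0.101 (illustrative only)
    sample = [4.85 + 0.101 * math.log(rng.expovariate(1.0)) for _ in range(268)]
    mu, rho = fit_gumbel_min(sample)
    print(mu, rho, ks_test(sample, lambda x: gumbel_min_cdf(x, mu, rho)))
```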

The literature on detecting structural breaks or change points in economic
time series points out that nonstationary time series are constructed from
locally stationary segments sampled from different distributions. Goldfeld
and Quandt conducted pioneering work on the separation of stationary
segments. Recently, a hierarchical segmentation procedure was also proposed
by Cheong et al. under the Gaussian assumption [23]. We applied this concept
to define the segments for $v_X(s)$ ($s = 1, \ldots, T$). Denoting the
likelihood functions as $L_1 = \prod_{s'=1}^{T} P(v_X(s'); \mu, \rho)$ and
$L_2(s) = \prod_{s'=1}^{s} P(v_X(s'); \mu_L, \rho_L) \prod_{s'=s+1}^{T} P(v_X(s'); \mu_R, \rho_R)$,
the difference between the log-likelihood functions can be defined as

$$\Delta(s) = \log L_2(s) - \log L_1. \tag{6}$$

Generally, $\Delta(s)$ is approximated by the Jensen-Shannon divergence,
which is defined as
$JS_{\pi_1,\pi_2}[p_1, p_2] = H[\pi_1 p_1 + \pi_2 p_2] - \pi_1 H[p_1] - \pi_2 H[p_2]$
with the weights $0 < \pi_1 < 1$ and $0 < \pi_2 < 1$ ($\pi_1 + \pi_2 = 1$),
where $H[p]$ is the Shannon entropy $H[p] = -\int dv\, p(v) \log p(v)$.
Namely, we have

$$\Delta(s)/T \approx JS_{\pi_L,\pi_R}\left[P(v_X; \mu_L, \rho_L), P(v_X; \mu_R, \rho_R)\right], \tag{7}$$

where the weights are $\pi_L = s/T$ and $\pi_R = (T - s)/T$.

Therefore, $P(v_X; \mu_L, \rho_L)$ is maximally different from
$P(v_X; \mu_R, \rho_R)$ when $\Delta(s)$ assumes a maximal value. This
spectrum has a maximum at some time $s^*$, which is denoted as

$$\Delta^* = \Delta(s^*) = \max_s \Delta(s). \tag{8}$$

The segmentation can be used recursively to separate the time series into
further smaller segments. We do this iteratively until all segment boundaries
have converged onto their optimal positions, as defined by a stopping
(termination) condition.

Several termination conditions were discussed in previous studies. Following
the study of Cheong et al. [23], we terminate the iteration if $\Delta^*$ is
less than a typical conservative threshold of $\Delta_0 = 10$, while the
procedure is recursively conducted if $\Delta^*$ is larger than $\Delta_0$.
We checked the robustness of this segmentation procedure with respect to
$\Delta_0$.
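
The recursive splitting can be sketched as follows. This is a simplified illustration and not the procedure as implemented for the results: it fits a Gumbel distribution for minima to each candidate segment by maximum likelihood, splits at the position that maximizes the log-likelihood gain, and recurses while the gain exceeds the threshold. It omits the boundary re-optimization step, and the minimum segment length `min_len` is an assumption added here so that each fit has enough distinct points.

```python
import math
import random


def fit_gumbel_min(v, iters=60):
    """Maximum-likelihood (mu, rho) for F(v) = 1 - exp(-exp((v + mu) / rho))."""
    n, vmax, vmin = len(v), max(v), min(v)
    if vmax == vmin:
        raise ValueError("need at least two distinct values")
    mean = sum(v) / n

    def weights(rho):
        return [math.exp((x - vmax) / rho) for x in v]   # shifted for stability

    lo, hi = 1e-6 * (vmax - vmin), vmax - vmin
    for _ in range(iters):                               # bisection on the scale
        mid = 0.5 * (lo + hi)
        w = weights(mid)
        if sum(x * wi for x, wi in zip(v, w)) / sum(w) - mean - mid > 0.0:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return -(vmax + rho * math.log(sum(weights(rho)) / n)), rho


def log_likelihood(v):
    """Maximized Gumbel log-likelihood of one segment."""
    mu, rho = fit_gumbel_min(v)
    return sum(-math.log(rho) + (x + mu) / rho - math.exp((x + mu) / rho) for x in v)


def best_split(v, min_len):
    """Return (s, gain) maximizing log L_2(s) - log L_1 over admissible splits."""
    base = log_likelihood(v)
    best_s, best_gain = None, -math.inf
    for s in range(min_len, len(v) - min_len + 1):
        gain = log_likelihood(v[:s]) + log_likelihood(v[s:]) - base
        if gain > best_gain:
            best_s, best_gain = s, gain
    return best_s, best_gain


def segment(v, delta0=10.0, min_len=8, offset=0):
    """Return sorted boundary indices found by recursive splitting."""
    if len(v) < 2 * min_len:
        return []
    s, gain = best_split(v, min_len)
    if gain <= delta0:
        return []
    left = segment(v[:s], delta0, min_len, offset)
    right = segment(v[s:], delta0, min_len, offset + s)
    return left + [offset + s] + right


if __name__ == "__main__":
    rng = random.Random(0)
    series = [5.0 + 0.1 * math.log(rng.expovariate(1.0)) for _ in range(60)]
    series += [4.0 + 0.1 * math.log(rng.expovariate(1.0)) for _ in range(60)]
    print(segment(series))   # synthetic series with a change after index 60
```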

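Let the number of segments be $L_X$, the parameter estimates at the $j$-th
segment be $\{\mu_{X,j}, \rho_{X,j}\}$, and the length of the $j$-th segment
be $\tau_{X,j}$, where $\sum_{j=1}^{L_X} \tau_{X,j} = T$. The cumulative
probability distribution for $v_X(s)$ ($s = 1, \ldots, T$) may be assumed to
be a finite mixture of Gumbel distributions:

$$\Pr(V_X \le v_X) = \sum_{j=1}^{L_X} \frac{\tau_{X,j}}{T} \left\{1 - \exp\left[-\exp\left(\frac{v_X + \mu_{X,j}}{\rho_{X,j}}\right)\right]\right\}. \tag{9}$$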
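
Evaluating the mixture is straightforward once the segments are known. The sketch below assumes that each segment is weighted by its length divided by the total length and that each component has the cumulative distribution $1 - \exp[-\exp((v + \mu)/\rho)]$. The segment values in the example are made up for illustration and are not estimates from the data.

```python
import math


def mixture_cdf(v, segments):
    """Pr(V <= v) under a finite mixture of Gumbel distributions for minima.

    segments: list of (tau, mu, rho) with segment length tau, location mu, scale rho.
    """
    total = sum(tau for tau, _, _ in segments)
    prob = 0.0
    for tau, mu, rho in segments:
        z = min((v + mu) / rho, 700.0)                # clamp to keep exp() finite
        # -expm1(-x) equals 1 - exp(-x) and stays accurate for tiny tail probabilities
        prob += (tau / total) * (-math.expm1(-math.exp(z)))
    return prob


if __name__ == "__main__":
    # hypothetical two-segment example
    segments = [(100, -4.85, 0.101), (168, -4.70, 0.120)]
    for level in (3.5, 4.0, 4.5, 5.0):
        print(level, mixture_cdf(level, segments))
```

Evaluating the function at a level below every observed weekly minimum gives the extrapolated probability of an event more extreme than any in the sample.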





FIG. 2. Temporal development of (a) $v_P(s)$ and (b) $v_D(s)$
from May 28, 2007 to May 31, 2012.

Fig. 2 shows the temporal development of $v_P(s)$ and $v_D(s)$ from May 28,
2007 to May 31, 2012. $L_P = 13$ and $L_D = 8$ are obtained from $v_X(s)$
with the proposed segmentation procedure. During the observation period, the
global financial system suffered from the following significant macroeconomic
shocks and crises: (I) the Paribas shock (August 2007), (II) the Bear Stearns
shock (February 2008), (III) the Lehman shock (September 2008 to March 2009),
(IV) the Euro crisis (April to May 2010), and (V) the Great East Japan
Disaster (March 2011). Before these global events, both $v_P(s)$ and
$v_D(s)$ took large values. Note that during the Paribas shock, the Bear
Stearns shock, and the Lehman shock, they took smaller values than they did
during the previous term. This implies that a global shock may drive many
participants, and these participants may trade the same currencies at the
same moment. The smallest values correspond to days in (II) the Bear Stearns
shock, (III) the Lehman shock, and (IV) the Euro crisis. These days are
generally related to the start or the end of macroeconomic shocks or crises.
From December 2011 to March 2012, the values of $v_D(s)$ are smaller than
they were during other periods. This result implies that during this period,
singular patterns appeared in the transactions.

FIG. 3. Cumulative distribution functions for the minimum values of the
entropy per node in each week, $v_P(s)$ and $v_D(s)$. Filled squares
represent the empirical distribution of $v_P(s)$, and unfilled circles
represent the empirical distribution of $v_D(s)$. A solid curve represents
the estimated distribution of $v_P(s)$, and a dashed curve represents the
estimated distribution of $v_D(s)$.

Fig. 3 shows the cumulative distribution functions of $v_P(s)$ and $v_D(s)$.
The Kolmogorov-Smirnov test supports this mixture assumption. The
distribution estimated by the finite mixture of Gumbel distributions for
quotations is well fitted, with a p-value of 0.11. The distribution estimated
for transactions is also well fitted, with a p-value of 0.72. Because neither
p-value leads to rejection at the 5% significance level, the mixture of
Gumbel distributions can explain both $v_P(s)$ and $v_D(s)$. Hence, the
occurrence probability of extreme events that have never been observed can be
inferred with the use of the finite mixture of Gumbel distributions with the
parameter estimates.

A method that is based on the concept of "entropy" in statistical physics was
proposed to quantify the states of a bipartite network under constraints. The
statistical-physical network entropy of a bipartite network was derived under
the constraints on the number of links at each group node. Numerical
simulation for a binary bipartite graph with random links showed that the
network entropy per link can capture both the density and the concentration
of links in the bipartite network. The proposed method was applied to measure
a bipartite network that consists of currency pairs and participants in the
foreign exchange market. The empirical investigation confirmed that the
entropy per link decreased before and after the latest global shocks
influenced the world economy. A method was proposed to determine segments
with recursive segmentation based on the Jensen-Shannon divergence between
Gumbel distributions. Under the assumption of a finite mixture of Gumbel
distributions, the estimated distributions were verified by the
Kolmogorov-Smirnov test. The finite mixture of Gumbel distributions can
estimate the occurrence probabilities of extreme synchrony of a nonstationary
system extracted as a bipartite network.

This work was supported by a Grant-in-Aid for Young Scientists (B)
(#23760074) from the Japan Society for the Promotion of Science (JSPS).